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Abstract 

Our work considers the optimization of the sum of a non-smooth convex function and a 
finite family of composite convex functions, each one of which is composed of a convex func¬ 
tion and a bounded linear operator. This type of problem is associated with many interesting 
challenges encountered in the image restoration and image reconstruction fields. We developed 
a splitting primal-dual proximity algorithm to solve this problem. Further, we propose a pre¬ 
conditioned method, of which the iterative parameters are obtained without the need to know 
some particular operator norm in advance. Theoretical convergence theorems are presented. 
We then apply the proposed methods to solve a total variation regularization model, in which 
the L2 data error function is added to the LI data error function. The main advantageous 
feature of this model is its capability to combine different loss functions. The numerical re¬ 
sults obtained for computed tomography (CT) image reconstruction demonstrated the ability 
of the proposed algorithm to reconstruct an image with few and sparse projection views while 
maintaining the image quality. 

Keywords: Sparse optimization; Proximity operator; Saddle-point, problem; CT image re¬ 
construction. 
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1 Introduction 

In this paper, we consider solving the following convex optimization problem: 

i 

mifi ^ Fi(Kjx) + G(x), (1.1) 

i=1 

where l is an integer, X and {Yj } l i=1 are Hilbert spaces, the functions {Fi}[ =l and G belong 
in T 0 (U) and T 0 (X), respectively, and Kj : X —> Yi is a continuous linear operator for i = 
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1, 2, • • • , l. Here and in what follows, for a real Hilbert space H , Yq{H) denotes the collection 
of all proper lower semi-continuous (LSC) convex functions from H to (—oo, +oo]. Based 
on the assumptions of problem (1.1), the functions (T) • iC)i<i<z may be used to model the 
data fidelity term, including smooth and non-smooth measures, and G could be the indicator 
function of a convex set or fh-norm, for example. Therefore, the optimization model (1.1) 
would be able to accommodate a combination of different data error functions. 

In particular, if l — 1, then problem (1.1) is reduced to the following 

min F(Kx) + G(x), (1.2) 

x&X 

where F E T 0 (F), G E T 0 (X), and K : X —> Y is a continuous linear operator. Under the 
assumption that the proximity operator of F* and G are easy to compute (i.e., it either has 
a closed-form solution or can be efficiently computed with high precision), Chambolle and 
Pock [1] proposed a primal-dual proximity algorithm to solve problem (1.2). They proved 
the convergence of the proposed iterative algorithm in finite dimensional Hilbert spaces. They 
also pointed out the relationship between the primal-dual proximity algorithm and other exist¬ 
ing algorithms, such as extrapolational gradient methods [2], the Douglas-Rachford splitting 
algorithm [3], and the alternating direction method of multipliers [4], Further, in [5], they 
introduced a precondition technique to compute the step size of the algorithm automatically. 
Numerical experiments showed that the preconditioned primal-dual proximity algorithm out¬ 
performs the primal-dual proximity algorithm in [1], He and Yuan [6] studied the convergence 
of the primal-dual proximity algorithm by presenting this algorithm of Chambolle and Pock [1] 
in the form of a proximal point algorithm in infinite dimensional Hilbert spaces. Condat [7] 
also obtained the convergence of the primal-dual proximity algorithm but from a different 
point of view, namely by studying the following optimization problem: 

min P(x) + G(x) + F(Kx ), (1.3) 

X 

where P : X —> R is convex, differentiable, and its gradient is Lipschitz continuous, and 
G, F, and K are the same as in problem (1.2). If P(x) = 0, then problem (1.3) reduces 
to problem (1.2). Condat [' ] proposed an efficient iterative algorithm for solving (1.3) and 
also proved its convergence based on Krasnoselskii-Mann iteration methods. The primal-dual 
proximity algorithm is a special case of Condat’s algorithm by setting P(x) = 0. Further, 
Condat proved the convergence of the primal-dual proximity algorithm in finite dimensional 
spaces where the parameters were relaxed from ctt\\K || 2 < 1 to ar\\K || 2 < 1. These are very 
useful results because it becomes possible to fix one parameter in the algorithm, allowing the 
other parameter to be tuned in practice. 

If we let Fo(x) = G(x),K 0 = /, then the problem (1.1) can also be formulated as follows, 

i 

min Fi{Kix). (1.4) 

X J 

2—0 


2 


Setzer et al. [8] proposed to use an alternating split Bregman method [9] to solve the problem 
(1.4) and proved [10,11] that this method coincided with the alternating direction method 
of multipliers, which can be interpreted as a Douglas-Rachford splitting algorithm applied 
to the dual problem. However, this iterative algorithm always incorporates linear equations, 
which are required to be solved either explicitly or approximately. Condat [7] considered the 
following general composite optimization problem, 

i 

min Fi(Kix) + G(x) + Q(x), (1.5) 

X ZJ 

2=1 

where the linear operators {Ki} l i=1 , the functions {F)}( =1 , and G are the same as in problem 

(1.1) , apart from the fact that the function Q(x) is also convex, differentiable, and displays a 
Lipschitz continuous gradient. He obtained an iterative algorithm to solve problem (1.5) by 
recasting it as problem (1.1) using the product spaces method. The iterative parameters in the 
algorithm introduced in [ ] rely on the estimation of the operator norm || ]>T =1 K*K t \\, which 
may affect its practical use. To overcome this disadvantage, we propose a preconditioned 
iterative algorithm to solve problem (1.1), where the iterative parameters are calculated self- 
adaptively. If the function Q(x) is equal to the least-squares loss function, i.e., Q(x) = 
\\\Ax — b |||, then problem (1.5) could be viewed as a special case of problem (1.1). 

The primal-dual algorithm is a very flexible method to solve the optimization problem 

(1.2) , which has wide potential application in image restoration and image reconstruction, for 
example, [12-18]. Sidky et al. [19] applied the primal-dual proximity algorithm introduced by 
Chambolle and Pock [1,5] to solve various convex optimization problems. For example, 

min h\Ax - b\\h (1.6) 

x Z 

min -||Ae — 6|||, s.t.,x>0; (1.7) 

x 2 

min h\Ax-b\\l + A||x|| ry ; (1.8) 

x Z 

min \\Ax — 6||i + A||:k||tv; (1-9) 

X 

min KL(Ax,b ) + A||x||tv, (1.10) 

X 

where || • ||i represents the tf-norm, || • || 2 represents the f? 2 -norm, || • \\ TV denotes the total 
variation semi-norm, KL(-, •) denotes the Kullback-Leibler (KL) divergence, and A > 0 is the 
regularization parameter balancing the data error term and the regularization term. The least- 
squares data error term is used widely in computed tomography (CT) image reconstruction. 
It is modeled by adding Gaussian noise to the collected data and the LI data loss function has 
the advantage of reducing the impact of image sampling with large outliers. They studied the 
application of this convex optimization problem in CT image reconstruction to demonstrate 
the performance of these different models under appropriate levels of noise. The numerical 
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results showed that the primal-dual proximity algorithm can efficiently solve these problems 
and it exhibited very good performance in terms of reconstructing simulated breast CT data. 
The work of Sidky et al. [19] motivated us to introduce a general composite optimization 
problem for image reconstruction. Then, the above optimization problem (1.8) and (1.9) 
would be a special case of our proposed optimization problem. 

The purpose of this paper is to introduce a splitting primal-dual proximity algorithm for 
solving problem (1.1) and to propose a preconditioning technique to improve the performance 
of this algorithm. In addition, theoretical convergence theorems are also provided. We then 
demonstrate the performance of our proposed algorithms by applying them to solve a com¬ 
posite optimization problem, which has wide application in the image restoration and image 
reconstruction fields. 

The rest of this paper is organized as follows. In Section 2, we provide selected background 
information on convex analysis. In Section 3, we briefly review the primal-dual proximal 
algorithm, together with one of its preconditioned techniques. These iterative algorithms 
are employed to develop a splitting primal-dual proximal algorithm for solving problem (1.1) 
and the results are presented in Section 4. In Section 5, we apply the proposed iterative 
algorithm to solve a particular convex optimization model, which is relevant to the CT image 
reconstruction problem. We use numerical results to illustrate the capabilities of our proposed 
algorithm in Section 6. Finally, we offer some conclusions. 


2 Preliminaries 


In this section, we introduce some definitions and notations. Let H be a real Hilbert space, 
with its inner product (-, •) and norm || • || = (-, -) 1 / 2 . We denote by T 0 (H) the set of proper 
lower semicontinuous (LSC), convex functions from H to (—oo,+oo]. 


Definition 2.1. Let / be a real-valued convex function on H , for which the proximity operator 
proxf is defined by 

proxf : H —>■ H 

1 ( 2 . 1 ) 

x i ^ argmin f(y) + -||x - y\\\. 

y£H A 


Let C be a nonempty closed convex set of H. The indicator function of C is defined on H 


as 


I'dx) 


0, if x e C, 
Too, otherwise 


( 2 . 2 ) 


It is easy to see that the proximity operator of the indicator function is the projection operator 
onto C. That is, prox Lc (x ) = Pc(x), where Pc represents the projection operator onto C. 
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For some simple functions, there is a closed-form solution of their proximity functions and 
we provide several examples. For other examples of proximity operators with closed-form 
expression, we refer the readers to [20] for details. 

Example 2.1. Let A > 0 and u G R N , then 


Iproxxw-w^u)^ = max(|uj| - A, 0)sign(ui), 

The proximity operator of Id-norm || • ||i is often referred to as a soft-thresholding operator, 
and denoted by Soft(u, A), i.e., Softfu, A) = prox^wfiv). 

Example 2.2. Let A > 0 and u G R N , then 

XL 

prox\\\.\\ 2 {u) = max(||d| 2 - A, 0)---—. 

\\u \\2 

Example 2.3. Let u G R 2JV ? then the norm ||u||i ,2 is defined by 

N 

ll«lll,2 = X] \J U i+ U N+i■ 

2=1 


Let A > 0, x G R 2iV and ||xj|| 2 — \J x i + x N+i’ then pr ox a||.|| 12 (x) can be expressed as 


P^AlHkaW 


max{||xi||2 - A, 0}- 


x i 


\ x i 2 


max{||xj ||2 - A, 0} 


%N+i 

\\xih 


, i = 1,2, - ■ • , N. 


(2.3) 


We also prove some proximity functions which will be used in the following sections. 

Lemma 2.1. For any u G R N and b G R N , define the function f(x ) = ||x — 6||i, then the 
proximity operator of pr ox \f(u) is given by 

prox\f(u) = b + Soft(u — b, A). (2.4) 

Proof. By the definition of the proximity operator, we know that 


prox\f(u ) = argmin < - ||x — w ||2 + A||a; — 6||i 

* (2 

Let x — b = y, then the above minimization problem reduces to 


arg min < -\\y + b - u\\ 2 2 + A||y||i 
y [2 

= argmin { h\y - (u - b) || 2 + A||j/||] 


= soft(u — b, A). 

Then, prox\f(u) = b + soft{u — 6, A). 


□ 
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Lemma 2.2. For any u G R N and b G R N , define the function f(x) = 
proximity operator of prox\f{u) is given by 

u + Xb 

1 +A ‘ 

Proof. By the definition of the proximity operator, we know that 

prox\f{u ) = argmin j -||x — + -A||x — 6||| 1. 

x y2 2 J 

The first-order optimality condition of (2.6) reduces to 



\\\ x ~ then, the 


(2.5) 


( 2 . 6 ) 


0 = (x — u) + A(x — b ), 

Then x = -yyy- That is, 

, u + Xb 
prox\ f [u) = —— . 

□ 


Similarly, by Example 2.2, we can deduce the proximity operator of function f(x) = 
\x — b\\ 2 that 


prox x \\-b\\ 2 (u) = argmin <j ^\\x - u\\l + A||x - b\\ 2 


= max i \\u 


{||«-f>l|2 


A,0}^xJL + ,,. 


(2.7) 


ll U - b\\ 2 

Recall that the Fenchel conjugate of a given function / is defined as f*(x) = sup u {< x,u > 
—/(«)}. The proximity operator of a function / and its Fenchel conjugate f* are connected 
by the celebrated Moreau’s identity [21]: 

a /* 1 


( 2 . 8 ) 


x = prox\f(x) + Xproxiffi —). 

The well-known Rudi-Osher-Fatemi (ROF) [22] total variation model is one of the most 
popular image denoising models. The ROF model is given by 

1 , 


arg mm 


\x — u || 2 + A||x|| T y 


(2,9) 


where u G R d denotes the noisy image and ||( t||tv is the total variation of x. Because total 
variation regularization can preserve the edges of images, it has been widely used in the image 
restoration and image reconstruction fields. The total variation norm || x||tv can be viewed as 
the combination of a convex function with a linear transformation. In fact, let B denote an 
N x N matrix defined by the following: 

\ 


/-I 


B : = 


V 


6 


-1 1 

°/ 








and define matrix D to be 2N 2 x N 2 , which could be seen as a finite difference discretization 
of an image from historiza and verti, 


D : = 


I ®B 
B ® I 


( 2 . 10 ) 


where / is the TV x N identity matrix and the notation P ®Q denotes the Kronecker product 
of matrices P and Q. 

Let x be an image in R N ~. Two definitions of total variation have appeared in the literature. 
The first is referred to as anisotropic total variation (ATV) and is defined by the formula 


x\\ TV '■= <p(Dx) = \\Dx\\i, 


( 2 . 11 ) 


where <p(z) : = ||*||i,z G R 2N2 , whereas the second definition of total variation is known as 
isotropic total variation (ITV) and is defined by the equation 


\x\\tv — 


<p{Dx) = \\Dx\\ 1)2 = Y 


i— 1 


(Dx)i 

(Dx'jn+i 


where Lp : R 2N ~ —> R as 


N 2 


v{z) ■= Y 


i =1 


Zi 

ZN 2 +i 


,zeR 


2N 2 


( 2 . 12 ) 


(2.13) 


3 A Primal-dual Proximity Algorithm for Solving (1.2) 

In this section, we recall selected primal-dual proximity algorithms for solving problem (1.2). 
First, the corresponding dual optimization problem of (1.2) is 


max -F*(y) - G*(-K*y). (3.1) 

y 

Here, F* and G* represent the Fenchel conjugate of F and G, respectively. Combining the 
primal problem (1.1) and dual problem (3.1) leads to the following saddle-point problem: 

min max (. Kx,y) + G(x) — F*(y). (3.2) 

x y 

Let problem (3.2) have a solution (x,y), then it satisfies the following variational inclusion 


f 0 \ / K*y + dG(x), \ 

y 0 j y -Kx + dF*y, ) 

where OF* and dG are the subgradients of the convex functions F* and G. 


(3.3) 
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Chambolle and Pock [1] proposed a primal-dual proximity algorithm for solving (3.2). 
Choosing ( x°,y° ) G X x Y and x° = x°, the iterative sequences {x k } and {y k } are given by 


y k+1 = prox a F*(y k + aKx k ), 
x k+1 = prox T c{x k — rK*y k+1 ), 
x k+ 1 = x k+1 + 9{x k+1 - x k ), 


(3.4) 


where cr, r > 0, and 9 G [0,1]. They proved its convergence with the requirement of 9 = 1 and 
<jt < 1/1| A"|| 2 in finite dimensional spaces. 

Define y 1 = prox a F*(y° + crKx 0 ), then the iterative sequence (3.4) can be rewritten 


as 


x k+1 = prox T c{x k — rK*y k+1 ), 
y k+2 = prox aF *{y k+1 + <xK(x k+l + 9(x k+1 — x k ))). 

Letting y k+1 = y k : we can simply rewrite the iterative sequence (3.5) as follows 

x k+1 = prox T c{x k — rK*y k ), 
y k + l = prox aF *{y k + crK(x k+1 + 6(x k+1 - x fc ))). 


(3.5) 


(3.6) 


The only difference between iterative sequences (3.4) and (3.6) is the initial value of y°. 
Because these iterative algorithms do not depend on the initial value of x° and y°, they are 
actually equivalent. Therefore, the details of the primal-dual proximity algorithm introduced 
by Chambolle and Pock [1] are actually those provided in Algorithm 3.1. 


Algorithm 3.1 Primal-dual proximity algorithm for solving (1.2) 
Initialization: Give r, a > 0, 9 G [0,1] and choose (x°,y°) Glx Y; 
For k = 0,1, 2, • • • do 

1. x k+1 = prox TG {x k — rK*y k ), 

2. y k+1 = prox aF *{y k + aK(x k+1 + 9(x k+l — x fc ))). 
end for when some stopping criterion is satisfied 


Theorem 3.1. ( [1]) Let 9 = 1 and the parameters <j,t satisfy crr||A"|| 2 < 1. Then, the 
sequence ( x k ,y k ) generated by Algorithm 3.1 converges weakly to an optimal solution (x*,y*) 
of the saddle-point problem (3.2). 

Remark 3.1. Condat [7] proved that the condition crr||A"|| 2 < 1 in Theorem could be relaxed 
to err || K || 2 < 1 in finite dimensional spaces. 

The convergence of Algorithm 3.1 relies on the operator norm ||A'||, which is not easy 
to estimate. Pock and Chambolle [5] attempted to address this shortcoming by proposing 
a precondition technique for Algorithm 3.1 where the step sizes r and cr are replaced by 
two symmetric and positive definite matrices, respectively. They also suggested a practical 





approach for obtaining the matrices T and £, thereby satisfying the convergence requirement 
of Theorem 3.2. 


Algorithm 3.2 Preconditioned primal-dual proximity algorithm for solving (1.2) 

Initialization: Choose symmetric and positive def ini te matrices T and £, 6 G [0,1], 
(. x°,y°) E X x Y. 

For k — 0,1, 2, • • • do 

1. x k+1 = proxTc(x k ~ T K*y k ), 

2. y k+1 = prox-sF*{y k + S/t (x k+1 + 6(x k+l — x k ))). 

end for when some stopping criterion is satisfied 


Theorem 3.2. ( [5]) Let 6=1 and let T, £ be symmetric and positive definite matrices such 
that ||E^JiT 2 || < 1. Then, the sequence ( x k ,y k ) generated by Algorithm 3.2 converges weakly 
to an optimal solution (x*,y*) of the saddle-point problem (3.2). 

As mentioned in [5], the matrices £ and T could be any symmetric and positive matrices. 
However, it is a prior requirement of Algorithm 3.2 that the proximity operators are simple. 
Thus, they proposed to choose £ and T with some diagonal matrices which satisfy all these 
requirements and guarantee the convergence of the algorithm. 


Lemma 3.1. ( [5]) Let T = diag(r), where r = ( ti , t 2 ,-- - , r n ) and £ = diag(a), where 
a = (<ri, • • • , a m ). In particular, 


T i = 


\k- 

L^i=i P v i,. 


1 2—a ’ 


= 


I K 

2^7=1 


h3\ 


then for any a G [0,2], 


|£2 A'Ta I 


WT^K^xP 

SU P -n —ttq- 

x€X,x^0 Ill'll 


< 1. 


In the next section, we shall see how to judiciously use the primal-dual proximity al¬ 
gorithms, including Algorithm 3.1 and Algorithm 3.2, to derive a variety of flexible convex 
optimization algorithms for the proposed problem (1.1). 


4 A Splitting Primal-dual Proximity Algorithm for Solv- 
ing (1.1) 

In comparison with the well-known forward-backward splitting algorithm and the alternating 
direction method of multipliers for solving problem (1.2), the forward-backward splitting al¬ 
gorithm needs one of the functions F or G to satisfy the differential and requires a Lipschitz 
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continuous gradient, and the alternating direction method of multipliers always involves a sys¬ 
tem of linear equations as its subproblem. In contrast, every subproblem of the primal-dual 
proximity algorithm is easy to solve and does not require any inner iteration numbers. This 
motivated us to extend the primal-dual proximity algorithm to solve the general optimization 
problem (1.1) 

First, we present the main iterative algorithm to solve problem (1.1) and prove its conver¬ 
gence as follows. 

Algorithm 4.1 A splitting primal-dual proximity algorithm for solving (1.1) 

Initialization: Give r, a > 0 such that ra < 1/1| Y2\=i ^*Ki\\, choose (x°, y®, y%, ■ ■ ■ , yf) G 
X x Yi x Y 2 x ■■■ x Yi‘ 

For k — 0, 1, 2, • ■ ■ do 

1. x k+i = prox rG (x k - t Y?i=i K iUi)i 

2. y k+1 = prox aF *(y k + crKf2x k+1 — x k )), for i — 1, 2, • ■ ■ ,1. 
end for when some stopping criterion is satisfied 


The dual problem of (1.1) is 


max — G* 

S/1,— ,S/i 



i 

%— 1 


and the saddle-point problem is 


min max 


l l 

Yl {KiX, y^ + G{x) - Y F i(.Vi)- 


i =1 


2—1 


(4.1) 


(4.2) 


Theorem 4.1. Let o > 0 and r > 0 be the parameters of Algorithm f.l, then the iterative 
sequence ( x k , y k , • • • , y k ) converges weakly to an optimal solution (x*, y*, - ■ ■ , y*) of the saddle- 
point problem (4.2). 


Proof. First, we convert the optimization problem (1.1) into the form of problem (1.2) by using 
a product spaces technique. For this purpose, we introduce the notation y := (y±, ■ ■ ■ ,y() for 
an element of the Hilbert space YY\ x ■ • • Yj, equipped with the inner product (y, z) = 
Yi=i(y^ z i)- For any y G Y, we define the function F G r 0 (F) by F(y) = Yi=i F i(Vi) and 
the linear function K : X —> Y by K:r := (Kix, • • • , Kix), i.e., 

( K \ \ 



\Ki 
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Then, we know that (F o K)(x) = Fi(Kix) + F 2 (K 2 x) + ••• + Fi(R\x). Therefore, the 
optimization problem (1.1) can be reformulated as the following 

min (F o K)(x) + G(x), 

X 

which is the exact optimization problem (1.2). Taking 6 = 1 in Algorithm 3.1, we obtain the 
iterative sequence for solving (1.1). 

f % k+l = prox TG (x k - rKV), 3 

\ y k+1 = prox ar { y k + aK{2x k+1 - x k )). 

By Theorem 3.1, we can conclude that the iterative sequence ( x k , y k , • • • , y k ) converges weakly 
to an optimal solution (x*,y*,--- ,y*) of the saddle-point problem (4.2). Further, as the 
function F is separable with variables, the Fenchel conjugate of F (u) = F*(ui) + Ff(u 2 ) + 

••• + F*(u{), for u := (ui,u 2 ,--- ,u{) E Y. Then, the proximity operator prox p* can be 
calculated independently, i.e., prox a p*(u) = (prox a F*(ui), ■ ■ ■ , prox a F*(ui))- Therefore, we 
can split the iterative sequence (4.3) and obtain the corresponding Algorithm 4.1 as stated 
before. 

□ 

Remark 4.1. Based on the results of Condat, the parameters can be relaxed to rcr < 1/|j KjK t 

in a finite dimensional Hilbert space. 

Algorithm 4.2 A preconditioned splitting primal-dual proximity algorithm for solving (1.1) 

Initialization: Choose symmetric and positive definite matrices T and Si, for i = 

1,2, ■■■,/, (x 0 ,y° 1 ,yl---,y' j >)eXxY 1 xY 2 x---xY l - 
For k = 0,1, 2, • • • do 

1. x k+1 = prox TG (x k - T K*y k ), 

2. y ? fc+1 = proxY, iF *{yi + S i A"j(2x A:+1 — x k )), for i = 1, 2, • • • , /. 
end for when some stopping criterion is satisfied 


Based on Lemma 3.1, we are able to suggest a practical way to choose the matrices T and 
(S k )(. =1 , respectively. 

Lemma 4.1. Let T = diag(r), where r = (ti,t 2 ,--- , r n ) and S k = diag(o k ), where a k = 
(a k , • • • , cr^ ), for k — 1, 2, • • • , l. In particular, 

1 t 1 


T i ~ ^l x~~\m 


EUEr=il^(bj)l 


2-q 


Oi = 


zu\ K ^j)\ a ' 


then for any a E [0, 2], 


| T 2 ACS * 1 2 1| “ = sup 


\T2KY2X\ 


where K = (Ad; K 2 , • • • ; K,) and S = (Sp S 2 ; • ■ ■ ; S z ). 


<1, 
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Remark 4.2. The advantages of our approach are the following: 

(i) There are limited assumptions for the functions {Fi} l i=1 and G; 

(ii) There is no inner iteration involved in the main process; 

(iii) The iterative parameters are easy to select. 


5 Applications 

In this section, we consider solving the following constrained composite optimization problem, 

min \w x \\Ax - b\\\ + w 2 \\Ax - b\h + A||o;||tv, 

1 2 (5.1) 

s.t. ieC, 

where x E R n , A E R mxn , b E R m , C is a closed convex set, w±,w 2 E [0,1] satisfying 
w\ + w 2 — 1, A is the regularization parameter, and ||x||rv denotes the total variation (TV) 
norm. 

It is easy to see that problem (5.1) includes the well-known L2 + TV (1.8) and LI + TV 
(1.9) problem as its special case. If w 2 = 0, then it reduces to the constrained L2 + TV 
problem, and if w\ = 0, then it reduces to the constrained LI + TV problem, respectively. 

In the following, we show that the optimization problem (5.1) is a special case of problem 
(1.1). The flexibility of problem (1.1) lies in the ease with which constraints can be incorpo¬ 
rated into this problem. It is observed from the definition of the total variation semi-norm 
(2.11) and (2.12) that || x||tv — (tp ° D)(x), with ip a convex lower semicontinuous function 
and D a real matrix. Then, the optimization problem (5.1) can be reformulated as follows. 

min \wxl\Ax - b\\l +w 2 \\Ax - b\h + Xip(Dx) + t c (x), (5.2) 

x 2 

where Lq is the indicator function of the closed convex set C. 

To match the formulation (1.1) with the problem at hand (5.2), we follow two approaches 
to obtain its solution. 

Method I. Let G(x) = 0, Fi(v) = — 6|||, K 3 = A, F 2 (v ) = w 2 \\v — 6|| l5 K 2 = A, 

F 3 (v) = A <p(v), K 3 = D, F±(y) = Lc(v), and K 4 = I. Then, we can apply Algorithm 4.1 to 
solve the problem (5.2). The detailed structure of the algorithm is presented as follows. 
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Algorithm 5.1 A first class of splitting primal-dual proximity algorithm for solving problem 

(5.1) _ 

Initialization: Give r, a > 0 such that ra < 1/||2A T A + D T D + J||, choose 


(x 

O 

l-'O 

2 / 2 °, 1 / 3 °, 1/4) e Ax Vx y 2 x 

Y, x Y 4 

For k = 

= 0,1,2,- - - do 


1. 


= x k — r(A T y k + A T y% + D T y k + ; 

2. 

v\ +l 

= prox aF * (y k + aA(2x k+1 

- x k )), 

3. 

y k 2 +1 

= prox aF * (y k + aA(2x k+1 

- x k )), 

4. 

y k 3 +1 

= prox aF * (y k + aD{2x k+1 

-x k )), 

5. 

yt +1 

= prox aF * (y k + cr( 2x k+1 — 

x k )). 


end for when some stopping criterion is satisfied 


In the following, we explain that every subproblem of Algorithm 5.1 can be calculated 
explicitly. In fact, the proximal operator of F* is determined via one of the functions F 
obtained by using Moreau’s identity (2.8). 

First, according to Moreau’s identity (2.8) and Lemma 2.2, we have 

yi +1 = prox aF *(a(-y k + A(2x k+1 - x k ))) 
a 

= a (I — proxi F )(—y k + A(2x k+1 — x k )) 

^ a 

= (y k + cA{2x k+1 - x k )) - ■ a ^ ( y k + aA(2x k+1 - x k ) + w x b) 

= Wl (y^ + aA(2x k+l - x k ) - ab). (5.3) 

vji + a 

Second, by Lemma 2.1, we can obtain the proximity operator of function aF^. That is 
y k 2 +l = prox aF ;{a(^y k + A{2x k+1 - x k ))) 

= a (I — proxi F )(—y% + A{ 2x k+1 — x k )) 

° a 

= (y k + aA(2x k+1 - x k )) - a(b + Soft{-y k + A(2x k+1 - x k ) - b, —)). (5.4) 

a cr 

Third, by taking into account the definition of the TV norm, the function F$(v) is equal 
to |M|i or ||u||i i2 , respectively. Then, the proximity of aF 3 * can also be calculated by 

y k+1 = prox aF *(a(—y k + D(2x k+1 - x k ))) 
a 

= a(I - proxi F )(—y k + D{2x k+l - x k )). (5.5) 

v 3 <J 

Then, for the anisotropic TV (ATV), we have 

Vs +1 = (l/s + aD (2x k+1 ~ x k )) - aSoft(-y k + D(2x k+1 - x k ), -). (5.6) 

a a 
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and for the isotropic TV (ITV), we also have a closed-form solution due to (2.3). 

Fourth, because the proximity of indicator function t c is equal to the projection operator 
onto the set C, we obtain 

V 4 +1 = prox aF *(a(—y k + ( 2x k+1 - x k ))) 
a 

= a (I - proxi F )(-y k 4 + ( 2x k+1 - x k )) 
a 

= (y k + a(2x k+1 - x k )) - aP c (-y k + ( 2x k+1 - x k )). (5.7) 

a 

Therefore, the original problem (5.1) is decomposed into an iterative sequence consisting 
of subproblems which are much easier to solve, each one with a closed-form solution. 

Next, we follow another approach to solve problem (5.2). 

Method II. Let G(x) = tc( x ), F i( w ) = — b\\l, K\ = A, F 2 (v) = w 2 \\v — 6||i, K 2 = A, 

Fs(v) = <p(v), and K3 = D. Then, we can apply Algorithm 4.1 to solve the problem (5.2), 

Algorithm 5.2 A second class of splitting primal-dual proximity algorithm for solving prob¬ 
lem (5.1) 

Initialization: Give r, a > 0 such that tct < ||2A r A + D T D ||, choose (a: 0 , 2 /°, 2 /°, 2 / 3 , 2 / 4 ) e 
A x Y 1 x Y 2 x V 3 x Y 4 ; 

For k — 0,1, 2, • ■ ■ do 

1. x k+l = P c (x k — r(A T y k + A T y k + D T y k ), 

2. y k+1 = prox aF *(y k + aA{2x k+1 — x fc )), 

3. y 2 +1 = prox aF *(y k + aA(2x k+1 — x k )), 

4. y k+1 = prox aF *(y k + crD(2x k+1 — x k )). 

end for when some stopping criterion is satisfied 


Remark 5.1. (1) The difference between Algorithm 5.1 and Algorithm 5.2 is that they treat 
the constraint C differently. In Algorithm 5.1, the indicator function is set as the combination 
of a convex function with an identity matrix, whereas in Algorithm 5.2, the indicator function 
is defined as the function G(x) in problem (1.1). 

(2) Algorithm 5.1 and Algorithm 5.2 use a fixed step size, which depends on the estimation 
of some matrix norm. This norm is its largest singular value, which can be computed via the 
power method in practice. 

Based on the preconditioned splitting primal-dual proximity algorithm (Algorithm 4.2), 
we obtain the corresponding preconditioned Algorithm 5.1 and Algorithm 5.2, respectively. 
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Algorithm 5.3 A first class of preconditioned splitting primal-dual proximity algorithm for 
solving problem (5.1) 

Initialization: Follow the Lemma to define the matrices T and Ej, i = 1,2, 3 , 4; Choose 

(x°, Vi, vl y° 3 , y° 4 ) ex xY 1 xY 2 x¥ 3 x y 4 ; 

For k = 0,1, 2, • • • do 

1. x k+1 = x k - T (A T y\ + A T y k + D r y\ + y k ), 

2. y k+1 = prox-£ lF *(y k + E 1 A(2x fc+1 — x k )), 

3 . y k+1 = proxY, 2 F*(y 2 + Y 2 A(2x k+l — x k )), 

4. y k+1 = proxY, 3 F*{yt + E 3 -D(2z fe+1 - x fc )), 

5 . y k+1 = prox-£ 4 p*{y k + Y 4 (2x k+1 - x k )). 

end for when some stopping criterion is satisfied 


Similarly, we can provide preconditioned Algorithm 5.2 as follows. 

Algorithm 5.4 A second class of preconditioned splitting primal-dual proximity algorithm 
for solving problem (5.1) 

Initialization: Follow the Lemma to define the matrices T and Ej, % — 1,2, 3; Choose 

(x°,y 0 1 ,y 0 2 ,y° 3 )eXxY 1 xY 2 xY 3 - 

For k = 0,1, 2, • • • do 

1. x k+1 = P c (x k - T (A T y k + A T y k + D T y k ), 

2. y k+1 = prox^FfijJi + Y 2 A(2x k+l — x k )), 

3. y k+l = prox-s 2 F*(y 2 + Y 2 A(2x k+1 - x k )), 

4- y 3 +1 = proxx 3 F*(yl + Y 3 D(2x k+1 — x k )). 
end for when some stopping criterion is satisfied 


Remark 5.2. (1) For the unconstrained optimization problem (5.1), i.e., C := R n , Algorithm 
5.1 and Algorithm 5.2 are equivalent, as are Algorithm 5.3 and Algorithm 5.4. 

(2) In comparison with Algorithm 5.1 and Algorithm 5.2, Algorithm 5.3 and Algorithm 
5.4 can be used to obtain the iterative parameters self-adaptively without the need to know 
the respective matrix norm. 

6 Numerical experiments 

In Section 5, we derived an instance of the proposed splitting primal-dual proximity algorithms. 
To demonstrate the performance of these proposed algorithms, we apply them to the test 
problems described in Section 5. All experiments were performed using MATLAB on a Lenovo 
Thinkstation running Windows 7 with an Intel Core 2 CPU and 4 GB of RAM. 

Two-dimensional tomography test problems were created by using AIRTools [23], which 
is a MATLAB software package for tomographic reconstruction that was developed by Prof. 


15 








Figure 1: Original Shepp-Logan phantom 


Perchristian Hansen and his collaborators. The package includes two core functions ’’fanbeam- 
tomo” and ’’paralleltomo”, which were used to generate the simulation data. For example, 
the function ’’paralleltomo” creates a 2D tomography test problem using parallel beams. 

[A,b,x] = paralleltomo(N, theta,p), (6.1) 


where the input variables are as follows: N is a scalar denoting the number of discretization 
intervals in each dimension such that the domain consists of N 2 cells, theta is a vector con¬ 
taining the angles in degrees (default: theta = 0:1: 179), and p is the number of parallel 
rays for each angle (default: p = round(\/2N)). The output variables are the following: A 
is a coefficient matrix with N 2 columns and length(theta) * p rows, b is a vector containing 
the projection data, and x is a vector containing the exact solution with elements between 0 
and 1. We refer the reader to the AIRTools manual for further details. The test image is the 
standard benchmark Shepp-Logan phantom (see Figure 1.) with size 256 x 256 and pixels are 
assigned values varying from 0 to 1. 

We measured the quality of recovered images by using the criterion signal-to-noise ratio 
(SNR), 

8 ||j ^ 

r II 2 / ’ 

•^rec || 2 / 

where x true is the original image, x rec denotes the reconstructed image obtained by using the 
iterative algorithms. The iterative process is stopped when the relative error 


SNR = log 10 


11 -I'l.rv 


||*£trae 


\x k+1 — || 2 

\\x k \\ 2 


< e, 


where e is a given small real number. 

We compare the performance of Algorithm 5.1 and Algorithm 5.2, as well as that of Al¬ 
gorithm 5.3 and Algorithm 5.4. The anisotropic TV (ATV) and isotropic TV (ITV) perform 
similarly; therefore, we use the (ATV) regularization term in the following test. The initial 
values of the variables are set to zero in all iterative algorithms. In Algorithm 5.1 and Al¬ 
gorithm 5.2, the induced norm of the operator ||2A T A + D T D + J|| and ||2A T A + D T D\\ are 
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estimated using the standard power iteration algorithm. For the preconditioned algorithm, 
the parameter a was set to one in Lemma 4.1. 

The projection angles in (6.1) are set as theta = 0 : 10 : 179. A total of 18 angles were used 
in the simulation test. The p value is set as default. Then, the system matrix A is 6516 x 65536, 
which is an under-determined matrix. Both Gaussian and impulsive noise are added to the 
projection data vector b. The performance of Algorithm 5.1, Algorithm 5.2, Algorithm 5.3, 
and Algorithm 5.4 are listed in Table 1 and Table 2, respectively. The entries indicate that 
the algorithm failed to reduce the error below the given tolerance e within a maximum number 
of 40000 iterations. 


Table 1: Comparison of the performance of Algorithms 5.1, 5.2, 5.3, and 5.4 in terms of SNR 
and iteration numbers with non-negativity constraints, i.e., C = {x\x > 0} 


Regularization 

Methods 

e = 1CT 3 

e = 10“ 4 

e = 10" 5 

e = 10" 6 

Parameter 


SNR(dB)/k 

SNR(dB)/k 

SNR{dB)/k 

SNR(dB)/k 


Algorithm 5.1 

18.66/3253 

24.31/15804 

25.25/36598 

25.34/- 

A - 0 6 

Algorithm 5.3 

24.40/430 

25.75/1236 

26.17/5852 

26.18/21265 


Algorithm 5.2 

19.08/1882 

25.65/14659 

26.11/30009 

26.15/- 


Algorithm 5.4 

24.52/378 

25.86/1154 

26.17/4634 

26.18/16433 


Algorithm 5.1 

18.73/3399 

25.49/17359 

26.73/38831 

26.78/- 

A — 0 8 

Algorithm 5.3 

25.33/465 

27.11/1346 

27.67/5805 

27.73/22503 


Algorithm 5.2 

19.01/2158 

26.84/17613 

27.53/36010 

27.55/- 


Algorithm 5.4 

25.43/404 

27.27/1286 

27.67/4559 

27.74/16967 


Algorithm 5.1 

18.67/3754 

27.41/19697 

28.87/- 

28.87/- 

A - 1.2 

Algorithm 5.3 

26.57/495 

29.10/1473 

29.86/4462 

30.09/21411 


Algorithm 5.2 

18.31/2593 

28.29/20614 

29.47/- 

29.47/- 


Algorithm 5.4 

26.65/445 

29.13/1412 

29.87/3930 

30.11/19898 


Algorithm 5.1 

18.37/3921 

28.31/21132 

29.78/- 

29.78/- 

A - 1 6 

Algorithm 5.3 

26.89/502 

30.00/1521 

30.99/4223 

31.43/23334 


Algorithm 5.2 

17.78/2791 

28.79/21704 

30.32/- 

30.32/- 


Algorithm 5.4 

27.00/471 

30.05/1476 

30.98/3751 

31.44/20318 


Algorithm 5.1 

18.26/4006 

28.10/21545 

29.80/- 

29.80/- 

A - 1 8 

Algorithm 5.3 

26.74/504 

29.92/1518 

31.06/4278 

31.59/20884 


Algorithm 5.2 

17.73/2920 

28.52/21850 

30.25/- 

30.25/- 


Algorithm 5.4 

26.80/478 

29.98/1490 

31.02/3889 

31.64/21790 


Algorithm 5.1 

17.92/3942 

27.73/21777 

29.40/- 

29.40/- 

A - 2 

Algorithm 5.3 

26.30/506 

29.53/1529 

30.76/4410 

31.33/20260 


Algorithm 5.2 

17.62/3073 

27.94/22010 

29.79/- 

29.79/- 


Algorithm 5.4 

26.32/487 

29.72/1552 

30.73/4134 

31.35/19732 


The results in Tables 1 and 2 indicate that preconditioned iterative Algorithm 5.3 and Al¬ 
gorithm 5.4 converge faster than iterative Algorithm 5.1 and Algorithm 5.2, respectively. The 
second class of splitting primal-dual proximity algorithms (Algorithm 5.2 and Algorithm 5.4) 
achieve higher SNR values than the first class (Algorithm 5.1 and Algorithm 5.3), respectively. 
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Table 2: Comparison of the performance of Algorithms 5.1, 5.2, 5.3, and 5.4 in terms of SNR 
and iteration numbers with box constraints, i.e., C = {x|0 < x < 1} 


Regularization 

Methods 

e = 10” 3 

e = 10 -4 

e = 10" 5 

e = 10" 6 

Parameter 


SNR(dB)/k 

SNR(dB)/k 

SNR(dB)/k 

SNR(dB)/k 


Algorithm 5.1 

19.20/2899 

24.86/14335 

26.05/35501 

26.15/- 

A - 0 6 

Algorithm 5.3 

24.65/380 

26.65/1208 

27.23/5966 

27.31/22942 


Algorithm 5.2 

19.08/655 

26.55/12137 

27.18/28035 

27.24/- 


Algorithm 5.4 

25.03/343 

26.78/1123 

27.25/4450 

27.30/16060 


Algorithm 5.1 

19.21/2969 

25.94/15457 

27.40/37330 

27.48/- 

A - 0 8 

Algorithm 5.3 

25.25/384 

27.83/1273 

28.65/5385 

28.77/22840 


Algorithm 5.2 

19.28/799 

27.58/14095 

28.47/32327 

28.55/- 


Algorithm 5.4 

25.53/353 

28.08/1244 

28.71/14688 

28.79/18123 


Algorithm 5.1 

19.08/3187 

27.46/17257 

29.09/- 

29.09/- 

A - 1 2 

Algorithm 5.3 

25.90/396 

29.23/1342 

30.22/14588 

30.48/22166 


Algorithm 5.2 

19.00/1132 

28.61/16171 

29.80/37411 

29.84/- 


Algorithm 5.4 

26.13/354 

29.30/1309 

30.22/3899 

30.50/19944 


Algorithm 5.1 

18.79/3346 

28.00/18658 

29.78/- 

29.78/- 

A - 1 6 

Algorithm 5.3 

26.11/413 

29.94/1422 

30.98/4097 

31.44/23096 


Algorithm 5.2 

18.52/1478 

28.96/17626 

30.38/- 

30.38/- 


Algorithm 5.4 

26.41/378 

29.97/1378 

31.00/3790 

31.45/20929 


Algorithm 5.1 

18.42/3366 

27.91/19126 

29.78/- 

29.78/- 

A - 1 8 

Algorithm 5.3 

25.89/415 

29.91/1462 

31.08/4373 

31.59/21341 


Algorithm 5.2 

18.38/1645 

28.78/18234 

30.29/- 

30.29/- 


Algorithm 5.4 

26.20/392 

29.92/1419 

31.03/3922 

31.63/21140 


Algorithm 5.1 

18.04/3432 

27.65/19592 

29.44/- 

29.44/- 

A - 2 

Algorithm 5.3 

25.53/422 

29.52/1488 

30.76/4380 

31.33/19802 


Algorithm 5.2 

18.31/1858 

28.13/18920 

29.81/- 

29.81/- 


Algorithm 5.4 

25.89/408 

29.56/1464 

30.73/4070 

31.35/20100 
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Figure 2: Reconstructed images obtained from Algorithms 5.1, 5.2, 5.3, and 5.4 with non¬ 
negativity constraints, respectively. 

In addition, the results in Tables 1 and 2 show that when the error tolerance decreases, 
the SNR value increases accordingly; however, this requires a larger number of iterations and 
is more time consuming. The regularization parameter also has an impact on the performance 
of these iterative algorithms. A large regularization parameter means that the total variation 
term is strongly penalized. We found the SNR value to increase as we increased the regular¬ 
ization parameter; however, the SNR value was observed to decrease when the regularization 
parameter exceeded the value of 2. 

A comparison between Tables 1 and 2 revealed that the SNR values of the reconstructed 
images are very similar for the given regularization parameter level. The reconstructed images 
are shown in Figure 2 and Figure 3, where the regularization parameter A = 1.8 and the 
tolerance e = 10 -6 . 

7 Conclusions 

In this paper, we proposed a splitting primal-dual proximity algorithm to solve the general 
optimization problem (1.1). As its iterative parameters rely on estimating some operator 
norm, this may affect its practical use. Thus, we introduced a precondition technique to 
compute the iterative parameters self-adaptively. Under some mild assumptions, we proved 
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Figure 3: Reconstructed images obtained from Algorithms 5.1, 5.2, 5.3, and 5.4 with box 
constraints, respectively. 

the theoretical convergence of both iterative algorithms. The methods proposed in this paper 
have been applied to the constrained optimization model (5.1), which has wide application in 
image restoration and image reconstruction problems. We verified the numerical performance 
of these iterative algorithms by applying them to CT image reconstruction problems. The 
numerical results were very promising. 

Although we have illustrated the use of our proposed methods in the context of a CT image 
reconstruction problem, the proposed methods can also be used to solve other application 
problems such as image deblurring and denoising, and statistical learning problems. 
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